k-mw-modes: An algorithm for clustering categorical matrix-object data
Authors
Abstract
In data mining, the input of most algorithms is a set of n objects, each described by a feature vector. However, in many real database applications, an object is described by more than one feature vector. In this paper, we call an object described by more than one feature vector a matrix-object, and a data set consisting of matrix-objects a matrix-object data set. We propose a k-multi-weighted-modes (abbr. k-mw-modes) algorithm for clustering categorical matrix-object data. In this algorithm, we define the distance between two categorical matrix-objects and propose a multi-weighted-modes representation of cluster prototypes. We give a heuristic method to choose the locally optimal multi-weighted-modes in each iteration of the k-mw-modes algorithm. We validated the effectiveness and benefits of the k-mw-modes algorithm on five real data sets from different applications. © 2017 Elsevier B.V. All rights reserved.
Keywords: Categorical data; Matrix-object; k-mw-modes algorithm
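The abstract does not state how the distance between two categorical matrix-objects is defined, so the following Python sketch is only an illustration of the idea, not the paper's definition. It assumes a matrix-object is a list of categorical feature vectors (rows) and, as a stand-in, compares two matrix-objects attribute by attribute through the relative frequencies of their values. The names `value_frequencies`, `matrix_object_distance`, and the sample data are hypothetical.

```python
from collections import Counter


def value_frequencies(matrix_object, attribute):
    """Relative frequency of each categorical value in one attribute column."""
    counts = Counter(row[attribute] for row in matrix_object)
    total = len(matrix_object)
    return {value: count / total for value, count in counts.items()}


def matrix_object_distance(x, y):
    """Assumed distance (not taken from the paper): for each attribute,
    half the L1 difference between the value-frequency distributions of
    the two matrix-objects, summed over all attributes."""
    n_attributes = len(x[0])
    distance = 0.0
    for j in range(n_attributes):
        fx = value_frequencies(x, j)
        fy = value_frequencies(y, j)
        values = set(fx) | set(fy)
        distance += 0.5 * sum(abs(fx.get(v, 0.0) - fy.get(v, 0.0)) for v in values)
    return distance


# Hypothetical data: each customer is a matrix-object made of several
# (product, payment method) records.
customer_a = [("milk", "cash"), ("bread", "cash"), ("milk", "card")]
customer_b = [("milk", "card"), ("milk", "card")]

print(matrix_object_distance(customer_a, customer_b))
```

When every matrix-object has exactly one row, this assumed distance reduces to the simple matching dissimilarity used by k-modes: each attribute contributes 0 if the two values agree and 1 otherwise.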
Similar resources
An Optimization K-Modes Clustering Algorithm with Elephant Herding Optimization Algorithm for Crime Clustering
The detection and prevention of crime, in the past few decades, required several years of research and analysis. However, today, thanks to smart systems based on data mining techniques, it is possible to detect and prevent crime in considerably less time. Classification and clustering-based smart techniques can classify and cluster the crime-related samples. The most important factor in the c...
3D Object Retrieval Based on PSO-K-Modes Method
By using semantic attributes of a 3D object, the user can search for targeted objects. The main advantage is that the user does not need to sketch a 3D object as the query for 3D object retrieval, and the retrieval system can obtain better retrieval performance. There are many categorical data among these attributes, and how to use those and find the most similar objects is a vital pr...
Genetic Distance Measure for K-modes Algorithm
K-means algorithm has been shown to be an effective and efficient algorithm for clustering. However, the k-means algorithm is developed for numerical data only. It is not suitable for the clustering of non-numerical data. K-modes algorithm has been developed for clustering categorical objects by extending from the k-means algorithm. However, no one applies this technique for classification of c...
Improving K-Modes Algorithm Considering Frequencies of Attribute Values in Mode
The original k-means algorithm is designed to work primarily on numeric data sets. This prohibits the algorithm from being applied to categorical data clustering, which is an integral part of data mining and has attracted much attention recently. The k-modes algorithm extended the k-means paradigm to cluster categorical data by using a frequency-based method to update the cluster modes versus t...
Improved K-Modes for Categorical Clustering Using Weighted Dissimilarity Measure
K-Modes is an extension of K-Means clustering algorithm, developed to cluster the categorical data, where the mean is replaced by the mode. The similarity measure proposed by Huang is the simple matching or mismatching measure. Weight of attribute values contribute much in clustering; thus in this paper we propose a new weighted dissimilarity measure for K-Modes, based on the ratio of frequency...
Journal title:
- Appl. Soft Comput.
Volume 57, Issue
Pages -
Publication date 2017